    Atar: Attention-based LSTM for Arabizi transliteration

    A non-standard romanization of Arabic script, known as Arabizi, is widely used in Arabic online and SMS/chat communities. However, since state-of-the-art tools and applications for Arabic NLP expect Arabic to be written in Arabic script, handling content written in Arabizi requires special attention, either by building customized tools or by transliterating it into Arabic script. The latter approach is the more common one, and this work presents two significant contributions in this direction. The first is the collection and public release of the first large-scale “Arabizi to Arabic script” parallel corpus, focusing on the Jordanian dialect and consisting of more than 25k pairs carefully created and inspected by native speakers to ensure the highest quality. Second, we present Atar, an attention-based encoder-decoder model for Arabizi transliteration. Training and testing this model on our dataset yields impressive accuracy (79%) and BLEU score (88.49).
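    At the core of attention-based encoder-decoder models like the one described is an attention step that weights encoder states by their relevance to the current decoder state. A minimal NumPy sketch of scaled dot-product attention (the shapes, names, and random toy data are illustrative assumptions, not the paper's actual architecture):

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def dot_product_attention(query, keys, values):
        """query: (d,), keys/values: (T, d). Returns context vector (d,) and weights (T,)."""
        scores = keys @ query / np.sqrt(query.shape[0])  # similarity of decoder state to each encoder state
        weights = softmax(scores)                        # attention distribution over source positions
        context = weights @ values                       # weighted sum of encoder states
        return context, weights

    # Toy example: 4 encoder states of dimension 8
    rng = np.random.default_rng(0)
    keys = values = rng.normal(size=(4, 8))
    query = rng.normal(size=8)
    context, weights = dot_product_attention(query, keys, values)
    ```

    At each decoding step the context vector is combined with the decoder's hidden state to predict the next Arabic character, letting the model attend to different Arabizi input positions as it writes the output.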

    Neural Arabic Text Diacritization: State of the Art Results and a Novel Approach for Machine Translation

    In this work, we present several deep learning models for the automatic diacritization of Arabic text. Our models are built using two main approaches, viz. Feed-Forward Neural Network (FFNN) and Recurrent Neural Network (RNN), with several enhancements such as 100-hot encoding, embeddings, Conditional Random Field (CRF) and Block-Normalized Gradient (BNG). The models are tested on the only freely available benchmark dataset, and the results show that our models are either better than or on par with other models, which, unlike ours, require language-dependent post-processing steps. Moreover, we show that diacritics in Arabic can be used to enhance models for NLP tasks such as Machine Translation (MT) by proposing the Translation over Diacritization (ToD) approach.
    Comment: 18 pages, 17 figures, 14 tables
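    Automatic diacritization is commonly framed as character-level sequence labeling: each input character is encoded as a vector and assigned a diacritic class. A toy NumPy sketch of the one-hot input encoding such FFNN/RNN taggers consume (the tiny vocabulary and example string are illustrative assumptions, not the paper's actual encoding scheme):

    ```python
    import numpy as np

    # Toy character vocabulary (a real system would cover the full Arabic alphabet)
    vocab = {ch: i for i, ch in enumerate("ابتثج")}

    def one_hot_encode(text, vocab):
        """Encode a string as a (len(text), |vocab|) one-hot matrix,
        one row per character -- the typical input to a neural tagger."""
        mat = np.zeros((len(text), len(vocab)))
        for pos, ch in enumerate(text):
            mat[pos, vocab[ch]] = 1.0
        return mat

    x = one_hot_encode("باب", vocab)  # 3 characters over a 5-symbol vocabulary
    ```

    The model's output is then a parallel sequence of diacritic labels, one per input character, which is why sequence models like RNNs and CRFs fit the task naturally.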

    A Deep Learning Approach for Amazon EC2 Spot Price Prediction

    © 2018 IEEE. Spot Instances (SIs) represent one of the mechanisms cloud service providers use to deal with idle resources in off-peak periods, where these resources are auctioned at low prices, in a dynamic manner, to customers with limited budgets. However, SIs are poorly utilized due to issues like out-of-bid failures and bidding complexity. Thus, effective SI price models are of great importance to customers in order to plan their bidding strategies. This paper proposes a deep learning approach for Amazon EC2 SI price prediction, which is a time-series analysis (TSA) problem. The proposed Long Short-Term Memory (LSTM) approach is compared with a well-known classical (i.e., non-deep-learning) approach for TSA, namely AutoRegressive Integrated Moving Average (ARIMA), using different accuracy measures commonly used in TSA. The results show the superiority of the LSTM approach over the ARIMA approach in many aspects.
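    Comparisons like this typically rely on standard TSA error measures computed on held-out predictions. A minimal NumPy sketch of three common ones (the abstract does not list which measures the paper uses, so these are illustrative; the toy prices are made up, not real EC2 data):

    ```python
    import numpy as np

    def mae(actual, predicted):
        """Mean Absolute Error."""
        return np.mean(np.abs(actual - predicted))

    def rmse(actual, predicted):
        """Root Mean Squared Error: penalizes large errors more than MAE."""
        return np.sqrt(np.mean((actual - predicted) ** 2))

    def mape(actual, predicted):
        """Mean Absolute Percentage Error (actual values must be nonzero)."""
        return np.mean(np.abs((actual - predicted) / actual)) * 100

    # Toy spot-price series (illustrative values only)
    actual = np.array([0.10, 0.12, 0.11, 0.13])
    predicted = np.array([0.11, 0.12, 0.10, 0.14])
    errors = {"MAE": mae(actual, predicted), "RMSE": rmse(actual, predicted)}
    ```

    Reporting several measures matters because they rank models differently: RMSE emphasizes occasional large price spikes, while MAE and MAPE weight all errors more evenly.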

    Benchmarking open source deep learning frameworks

    Deep Learning (DL) is one of the hottest fields in machine learning. To foster the growth of DL, several open source frameworks have appeared, providing implementations of the most common DL algorithms. These frameworks vary in the algorithms they support and in the quality of their implementations. The purpose of this work is to provide a qualitative and quantitative comparison among three such frameworks: TensorFlow, Theano and CNTK. To ensure that our study is as comprehensive as possible, we consider multiple benchmark datasets from different fields (image processing, NLP, etc.) and measure the performance of the frameworks' implementations of different DL algorithms. For most of our experiments, we find that CNTK's implementations are superior to those of the other frameworks under consideration.
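    Quantitative comparisons of this kind come down to timing identical workloads under each framework's implementation. A minimal, framework-agnostic timing harness sketch (the function names, warmup policy, and repetition count are illustrative assumptions, not the paper's benchmarking protocol):

    ```python
    import time

    def benchmark(fn, *args, repeats=5, warmup=1):
        """Return the best wall-clock time (seconds) of fn(*args) over several runs.
        Warmup runs absorb one-time costs (graph compilation, cache fills)."""
        for _ in range(warmup):
            fn(*args)
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn(*args)
            timings.append(time.perf_counter() - start)
        return min(timings)  # min over runs is a stable estimate of achievable speed

    # Toy workload standing in for a framework's implementation of some DL op
    workload = lambda n: sum(i * i for i in range(n))
    t = benchmark(workload, 10_000)
    ```

    Taking the minimum over repeated runs, after warmup, reduces noise from the OS scheduler and one-time initialization, which otherwise dominates short benchmarks.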

    Collusion attacks in Internet of Things: Detection and mitigation using a fog based model

    © 2017 IEEE. This paper discusses the problem of collusion attacks in Internet of Things (IoT) environments and how the mobility of IoT devices increases the difficulty of detecting such attacks. It demonstrates how approaches used for detecting collusion attacks in WSNs are not applicable in IoT environments. To this end, the paper introduces a model based on the Fog Computing infrastructure to keep track of IoT devices and detect collusion attackers. The model uses the Fog Computing layer for real-time monitoring and detection of collusion attacks in IoT environments. Moreover, the model uses a software defined system layer to add a degree of flexibility for configuring Fog nodes, enabling them to detect various types of collusion attacks. Furthermore, the paper highlights the possible overhead on Fog nodes and the network when applying the proposed model, and argues that the Fog layer infrastructure can provide the resources required for the model to scale.
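    To make the monitoring idea concrete: a Fog node observing device reports could flag suspiciously coordinated behaviour. The heuristic below is purely illustrative (a hypothetical detector, not the paper's actual detection algorithm): it flags pairs of devices that gave identical ratings to many common targets, a crude signature of collusion.

    ```python
    from itertools import combinations

    def flag_colluders(reports, min_shared=3):
        """reports: {device_id: {target_id: rating}}. Flag device pairs that gave
        identical ratings to at least `min_shared` common targets -- a crude
        proxy for coordinated (colluding) behaviour. Illustrative only."""
        flagged = []
        for a, b in combinations(sorted(reports), 2):
            shared = set(reports[a]) & set(reports[b])
            agreements = sum(1 for t in shared if reports[a][t] == reports[b][t])
            if agreements >= min_shared:
                flagged.append((a, b))
        return flagged

    # Toy report log as a Fog node might accumulate it
    reports = {
        "d1": {"t1": 0, "t2": 0, "t3": 0},
        "d2": {"t1": 0, "t2": 0, "t3": 0},  # agrees with d1 on all 3 targets
        "d3": {"t1": 1, "t2": 0, "t3": 1},
    }
    suspects = flag_colluders(reports)
    ```

    Running such checks at the Fog layer, rather than in the cloud, keeps detection close to the mobile devices being tracked, which is the latency argument the model rests on.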